Improving the Performance of Morton Layout by Array Alignment and Loop Unrolling: Reducing the Price of Naivety
Authors
Abstract
Hierarchically-blocked non-linear storage layouts, such as the Morton ordering, have been proposed as a compromise between row-major and column-major layouts for two-dimensional arrays. Morton layout offers some spatial locality whether traversed row-wise or column-wise. The goal of this paper is to make this an attractive compromise, offering close to the performance of row-major traversal of row-major layout while avoiding the pathological behaviour of column-major traversal. We explore how the spatial locality of Morton layout depends on the alignment of the array’s base address, and how unrolling has to be aligned to reduce address calculation overhead. We conclude with extensive experimental results using five common processors and a small suite of benchmark kernels.
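As an illustration of the locality and alignment claims in the abstract, the following is a minimal sketch and not the paper's implementation. It computes Morton indices by bit interleaving and counts how many cache lines a single row or column traversal touches. The function names (`interleave_bits`, `morton_index`, `lines_touched`), the 16x16 array size, and the line size of 8 elements (for example, 64-byte lines holding 8-byte values) are all assumptions chosen for the example.

```python
def interleave_bits(v: int) -> int:
    """Spread the low 16 bits of v so that bit k moves to bit 2k."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton_index(row: int, col: int) -> int:
    """Element offset in Morton order: row bits on odd positions, column bits on even."""
    return (interleave_bits(row) << 1) | interleave_bits(col)


def row_major_index(row: int, col: int, n: int) -> int:
    """Element offset in a conventional row-major n x n array."""
    return row * n + col


def lines_touched(offsets, elems_per_line: int, base_offset: int = 0) -> int:
    """Count distinct cache lines hit by the given element offsets.

    base_offset is the array's starting position, in elements, relative
    to a cache-line boundary (0 means the array is line-aligned).
    """
    return len({(base_offset + o) // elems_per_line for o in offsets})


N = 16      # hypothetical array dimension
LINE = 8    # hypothetical elements per cache line

morton_row = [morton_index(0, c) for c in range(N)]   # traverse row 0
morton_col = [morton_index(r, 0) for r in range(N)]   # traverse column 0
rm_row = [row_major_index(0, c, N) for c in range(N)]
rm_col = [row_major_index(r, 0, N) for r in range(N)]

print("row-major  row/col lines:", lines_touched(rm_row, LINE), lines_touched(rm_col, LINE))
print("Morton     row/col lines:", lines_touched(morton_row, LINE), lines_touched(morton_col, LINE))
# Same Morton row traversal with the base address shifted by half a line.
print("Morton row, misaligned  :", lines_touched(morton_row, LINE, base_offset=4))
```

Under these assumptions, the row-major array touches 2 lines along a row and 16 down a column, while the aligned Morton array touches 4 and 8. Shifting the Morton array's base by half a line raises the row traversal to 8 lines, which is the kind of alignment sensitivity the abstract refers to.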
Similar resources
Non-linear memory layout transformations and data prefetching techniques to exploit locality of references for modern microprocessor architectures with multilayered memory hierarchies (PhD thesis)
One of the key challenges facing computer architects and compiler writers is the increasing discrepancy between processor cycle times and main memory access times. To overcome this problem, program transformations that decrease cache misses are used to reduce the average latency of memory accesses. Tiling is a widely used loop iteration reordering technique for improving locality of referenc...
Alternative array storage layouts for regular scientific programs
This thesis concerns techniques for using hierarchical storage formats such as Morton layout as an alternative storage layout for regular scientific programs operating over dense two-dimensional arrays. Programming languages with support for two-dimensional arrays use one of two linear mappings from two-dimensional array indices to locations in the machine’s one-dimensional address space: rowma...
An Aggressive Approach to Loop Unrolling
A well-known code transformation for improving the execution performance of a program is loop unrolling. The most obvious benefit of unrolling a loop is that the transformed loop usually, but not always, requires fewer instruction executions than the original loop. The reduction in instruction executions comes from two sources: the number of branch instructions executed is reduced, and the inde...
Is Morton layout competitive for large two-dimensional arrays yet?
Two-dimensional arrays are generally arranged in memory in row-major order or column-major order. Traversing a row-major array in column-major order, or vice versa, leads to poor spatial locality. With large arrays the performance loss can be a factor of 10 or more. This paper explores the Morton storage layout, which has substantial spatial locality whether traversed in row-major or column-maj...
Analytical Study of Optical Bi-Stability of a Single-Bus Resonator Based on InGaAs Micro-Ring Array
In this paper, for the first time to our knowledge, we investigate the optical bi-stability in a compact parallel array of micro-ring resonators with 5 μm radius, induced by optical nonlinearity. Due to the nature of perfect light confinement, resonance and accumulation process in a ring resonator, optical nonlinear effects, even at small optical power of a few milliwatts in this structure are ...